Efficient Hardware for Tile-Based Rasterization

Authors

  • Dan Crisu
  • Sorin Cotofana
  • Stamatis Vassiliadis
  • Petri Liuha
Abstract

An efficient logic-enhanced memory architecture is presented that solves existing problems associated with 3D graphics tile-based hardware rasterization algorithms. The memory contains the same number of bits as the number of pixels in the tile. During rasterization it is filled, over several clock cycles, by a systolic primitive scan-conversion subsystem with the stencil of the primitive: ones are written to memory locations that represent tile pixels covered by the primitive, and zeros are stored elsewhere. Once the shape of the primitive has been encoded in the memory, the memory's internal logic can deliver, on request, up to four hit positions (tile positions inside the primitive) per clock cycle to the pixel processing pipelines, and it signals when all hit positions have been consumed. With the proposed memory architecture, no search overhead is needed to find the first hit position inside a primitive. Furthermore, “ghost” primitives are handled efficiently with a small constant delay irrespective of the primitive size. Finally, hit positions (communicated in a spatial pattern to increase texture cache hit ratios) can always be mapped to different memory banks in the Z-buffer or color buffer, breaking the “read-modify-write” dependency associated with depth test and color blending. A hardware implementation in a commercial 0.18 μm process technology for a QVGA 3D graphics hardware accelerator with a tile size of 32×16 pixels has indicated that the memory can be clocked at 200 MHz and occupies an area of 120,000 μm².

Keywords: 3D graphics architectures; tile-based rasterization; embedded systems; memory architectures.
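
The behaviour described in the abstract (one stencil bit per tile pixel, then up to four hit positions handed out per cycle, each going to a different bank) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the edge-function coverage test, the 2×2 block traversal, the bank numbering, and the names `fill_stencil`, `hit_batches`, and `is_ghost` are all hypothetical choices made for illustration. Only the 32×16 tile size and the four-hits-per-cycle limit come from the abstract.

```python
TILE_W, TILE_H = 32, 16  # tile size quoted in the abstract


def edge(ax, ay, bx, by, px, py):
    """Signed edge function of point (px, py) relative to the edge a -> b."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def fill_stencil(tri):
    """Return one integer per tile row; bit x is 1 when pixel (x, y) is covered.

    Assumption: coverage is decided with edge functions sampled at pixel
    centres. The abstract does not say how the scan-conversion subsystem
    computes coverage.
    """
    (x0, y0), (x1, y1), (x2, y2) = tri
    rows = []
    for y in range(TILE_H):
        bits = 0
        for x in range(TILE_W):
            px, py = x + 0.5, y + 0.5
            w0 = edge(x0, y0, x1, y1, px, py)
            w1 = edge(x1, y1, x2, y2, px, py)
            w2 = edge(x2, y2, x0, y0, px, py)
            # Accept either vertex winding order.
            if (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0):
                bits |= 1 << x
        rows.append(bits)
    return rows


def is_ghost(rows):
    """A primitive that covers no pixel in this tile leaves the stencil all zero."""
    return not any(rows)


def hit_batches(rows):
    """Yield one list of up to four (x, y, bank) hits per simulated cycle.

    Assumption: hits are grouped by 2x2 pixel block, and the bank index is
    built from the low bits of x and y, so hits in one batch never share a bank.
    Empty blocks are skipped and produce no cycle.
    """
    for qy in range(0, TILE_H, 2):
        for qx in range(0, TILE_W, 2):
            batch = []
            for dy in (0, 1):
                for dx in (0, 1):
                    x, y = qx + dx, qy + dy
                    if (rows[y] >> x) & 1:
                        batch.append((x, y, dx | (dy << 1)))
            if batch:
                yield batch


if __name__ == "__main__":
    stencil = fill_stencil([(0, 0), (8, 0), (0, 8)])
    for cycle, batch in enumerate(hit_batches(stencil)):
        print(cycle, batch)
```

In this model an all-zero stencil ends the iteration immediately, which loosely mirrors the constant-delay handling of “ghost” primitives, and the 2×2 grouping keeps consecutive hits spatially close, in the spirit of the texture-cache remark in the abstract.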


Similar resources

Efficient Depth of Field Rasterization Using a Tile Test Based on Half-Space Culling

For depth of field rasterization, it is often desirable to have an efficient tile versus triangle test that can conservatively compute which samples on the lens need to execute the sample-in-triangle test. We present a novel test for this, which is optimal in the sense that the region on the lens cannot be further reduced. Our test is based on removing half-space regions of the (u,v)-space...


Conservative and Tiled Rasterization Using a Modified Triangle Set-Up

Several algorithms that use graphics hardware to accelerate processing require conservative rasterization in order to function correctly. Conservative rasterization stands for either overestimating or underestimating the size of the triangles. Overestimation is carried out by including all pixels that are at least partially overlapped by the triangle, whereas underestimation includes only the p...


Adaptive Ray-bundle Tracing with Memory Usage Prediction: Efficient Global Illumination in Large Scenes

This paper proposes an adaptive rendering technique for ray-bundle tracing. Ray-bundle tracing can be done by per-pixel linked-list construction on a GPU rasterization pipeline. This rasterization based approach offers significant benefits for the efficient generation of light maps (e.g., hardware acceleration, tessellation, and recycling of shaders used in real-time graphics). However, it is i...


Hyperplane Culling for Stochastic Rasterization

We present two novel culling tests for rasterization of simultaneous depth of field and motion blur. These tests efficiently reduce the set of xyuvt samples that need to be coverage tested within a screen space tile. The first test finds linear bounds in ut- and vt-space using a separating line algorithm. We also derive a hyperplane in xyuvt-space for each triangle edge, and all samples outside of...


Suitability of tile-based rendering for low-power 3D graphics accelerators

In this dissertation, we address low-power, high-performance 3D graphics accelerator architectures. The purpose of these accelerators is to relieve the main processor of the burden of graphical computations and also to achieve better energy efficiency than can be achieved by executing these computations on the main processor. Since external data traffic is a major source of power consumptio...




Publication date: 2004